FOREWORD


Within the U.S. Army Research Institute for the Behavioral and
Social Sciences (ARI), research in enhancing human performance by
establishing the limits of sensory perception is an important facet of
the Human Factors in Tactical Operations Technical Area. Current
research emphasizes visual perception in night operations. Previous
research emphasizing auditory perception is reported in several recent
ARI Technical Papers. Technical Paper 295 discusses the comprehension
of time-compressed speech as a function of training. Technical Paper
296 reports a method for measuring, through repeated judgments of
comprehensibility, the maximum rate of speech understood by individual
listeners. This paper attempts to determine whether the method is a
measure of speech intelligibility or of comprehension. (A modified
version of the text was published in Perception & Psychophysics, 1977,
Vol. 22(4), 366-372.)

Results  of  the  work  reported  here,  done  under  Army  Project 
2T161101A91B,  are  applicable  to  a wide  variety  of  situations.  The 
comprehensibility  measures  may  be  used  whenever  there  is  a need  to 
supplement  traditional  measures  of  intelligibility  and  comprehension 
in  the  evaluation  of  the  understanding  of  speech. 
BRIEF


Requirement:

The Army's continuing need for improvement of human communications
in tactical environments has led to research relating the rapidity of
speech to its intelligibility and comprehension. A previous study
(Technical Paper 296) described a threshold method for determining the
maximum rate of speech understood by an individual listener. The
purpose of the present study is to determine the relationship of this
speech rapidity threshold to the intelligibility and comprehensibility
of speech.

Procedure:

Two experiments were conducted to determine whether the speech-rate
threshold is related to the intelligibility of speech or to speech
comprehension. The first experiment compared thresholds for two types
of time-compressed speech reportedly different in intelligibility:
(1) simple speeded speech produced by increasing the playback speed of
recorded speech, and (2) compressed speech produced by the sampling
method, which deletes minute sections of speech. The second experiment
investigated the relationship of the threshold to comprehension by
means of traditional multiple-choice comprehension measures.


Findings:

There were clear indications that compressed speech is more
intelligible than speeded speech. Thresholds for speeded and compressed
speech differed significantly (218 wpm vs. 266 wpm, respectively), which
indicates that the threshold at least involves intelligibility.
Correlational analysis indicated little relationship between thresholds
and comprehension test scores. The conclusion, therefore, was that
judgments of comprehensibility reflect an intermediate step in
information processing that involves the perception of potential for
interpretation or comprehension rather than comprehension per se.


The threshold may be used in a number of ways: (1) to supplement
such traditional methods of evaluating the understanding of spoken
language as intelligibility and comprehension testing; (2) to determine
individual differences in speech perception which may be related to
practical skills in communication; and (3) to evaluate the quality of
speech produced by devices such as speech compressors and speech
synthesizers. Perhaps the threshold may also be used to evaluate the
difficulty of recorded spoken materials and the syntactic and semantic
variables that underlie difficulty.
A SPEECH-RATE INTELLIGIBILITY/COMPREHENSIBILITY
THRESHOLD FOR SPEEDED AND TIME-COMPRESSED
CONNECTED SPEECH


INTRODUCTION 

Although the relationship of the intensity of speech to speech
perception has been extensively investigated, the relationship of
rapidity of speech to its perception has been less thoroughly explored.
Recent technological advances have made it possible to reproduce speech
at rates well beyond the limits of human capacity to understand it and
have enabled investigators to explore the perception of rapid speech.
Such speech is known as time-compressed, or simply compressed, speech.

The simplest method of producing time-compressed speech is to increase
the speed of a playback device above the speed at which the speech was
originally recorded. This is sometimes referred to as the speed-changing
method, and the speech so produced as speeded speech. This procedure
increases not only the rate of speech but its pitch as well, because
the frequency of the speech components varies directly with playback
speed.

Another method of producing time-compressed speech is the so-called
sampling method; speech produced this way is usually referred to as
compressed speech. The groundwork for this type of speech compression
was laid by Miller and Licklider (1950), who found that interruptions
of speech at 10 or more times per second did not interfere with
intelligibility of a speech signal until relatively large amounts of the
signal were discarded. Subsequently, Garvey (1953) explored the effects
of cutting out short segments of a tape recording and physically
rejoining the remaining parts. The resulting speech was time-compressed
without the distortion in pitch. Fairbanks, Everitt, and Jaeger (1954)
developed an electromechanical device to compress speech without the
cumbersome manual manipulation used by Garvey, and more recently a
number of electronic devices have been developed to compress speech
by the sampling method. Such devices sample the speech signal at very
frequent intervals and discard a portion of each sample.
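
The sampling method described above can be illustrated with a minimal
sketch. It does not model any particular device; the function name
compress_by_sampling, the frame length, and the list-of-samples
representation are assumptions made only for illustration. Each frame of
the signal is split into a retained part and a discarded part, so the
duration shrinks while the retained samples are played back at their
original rate.

```python
def compress_by_sampling(samples, frame_len, ratio):
    """Shorten a signal by keeping only the first part of each frame.

    samples   -- sequence of audio samples (hypothetical representation)
    frame_len -- number of samples per sampling interval (assumed value)
    ratio     -- compression ratio, e.g. 2.0 keeps half of every frame
    """
    if ratio < 1.0:
        raise ValueError("ratio must be >= 1.0 for compression")
    keep = max(1, round(frame_len / ratio))  # samples retained per frame
    out = []
    for start in range(0, len(samples), frame_len):
        frame = samples[start:start + frame_len]
        out.extend(frame[:keep])  # the rest of the frame is discarded
    return out


signal = list(range(1000))  # stand-in for 1,000 recorded samples
compressed = compress_by_sampling(signal, frame_len=100, ratio=2.0)
print(len(signal), len(compressed))
```

Because the retained segments are not resampled, a sketch of this kind
shortens the signal without the pitch shift that accompanies the
speed-changing method.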

In investigations on the effects of the rapidity of speech, the
methods for the evaluation of speech have been extensions or
applications of methods used for the evaluation of conventional speech,
namely, the measurement of comprehension and the measurement of
intelligibility. The evaluation of compressed speech by these methods
was reviewed by Foulke and Sticht (1969). Comprehension measurement
is applied to speech materials of some length, such as connected
discourse or free-running speech, and is obtained by asking questions
concerning the content of the materials, usually in the form of
objective multiple-choice questionnaires. A frequent finding is that
comprehension declines as the rate of speech increases. Foulke (1971)
and Foulke and Sticht (1969) reported that when comprehension of
compressed speech passages is measured, a rapid decline in
comprehension is found above a speech rate of approximately 250 to 275
words per minute irrespective of the word rate of the original passage.

Another common method for evaluating speech, intelligibility
testing, derives from investigations of the ability of the telephone
system to transmit speech. The method has been used to investigate the
intensive aspects of audition, specifically, the ability of a person to
hear speech. In the articulation test, individuals are presented with
brief messages at varying intensities--usually single words, but
occasionally phrases or short sentences--and asked to repeat or
correctly identify them. The threshold of intelligibility is defined as
the intensity at which 50% of the materials so presented can be
reproduced. In compressed speech research, similar procedures have been
employed with single words compressed in time. For example, Garvey
(1953) used the percentage of compressed words correctly identified as
the index of intelligibility; Calearo and Lazzaroni (1957) determined
the threshold intensity required for compressed word identification;
and Foulke (1969) used reaction time for the identification of single
compressed words as an index of intelligibility.

The above methods of evaluating intelligibility are confined to
brief materials; there have been, however, several attempts to measure
the intelligibility of connected, running speech (Chaiklin, 1959;
Dahle, Hume, & Haspiel, 1968; Falconer & Davis, 1947; Haspiel & Havens,
1966; Hawkins & Stevens, 1950; Lezak, Siegenthaler, & Davis, 1964;
Speaks, Parker, Harris, & Kuhl, 1972). In these studies, listeners
were asked to adjust the intensity of the speech until they could just
understand it, or some percentage of it. Many of the studies used the
Bekesy technique to determine speech intensity thresholds, and all of
them were concerned with the intensity of the auditory signal required
for understanding speech.

Despite the preceding work, a threshold method for determining
the maximum rate of speech understood by an individual based on
variations in the rate of speech has not been available. Consequently, a
simple, direct, psychophysical threshold method was developed, modeled
after the automated threshold technique of Bekesy (1947) for
determination of the auditory threshold. The subject is required to
respond to and control the changing rate of speech in order to bracket a
threshold of understanding. Intensity is not varied. All stimuli are
supraliminal in intensity and only the rapidity of speech is varied.

The method assumes that as speech becomes progressively more rapid
a point is reached where the individual can no longer understand it.
Accordingly, an attempt is made to determine this point by a threshold
technique. The task set for the listener is essentially a perceptual
one: He must perceive the point at which he fails to understand speech
as its rapidity increases. Little published evidence exists that rate
of speech perception is a variable which obeys the same psychophysical
relationships as other perceptual variables, although Hutton (1955),
using quite brief stimulus materials (ranging from 8.0 to 42.6 sec in
duration), found that perceived word rate was a logarithmic function
of measured rate.

Both intelligibility and comprehension measurement contributed to
the conceptualization of the threshold of understanding. Foulke and
Sticht's (1969) conclusion that a more rapid decline in comprehension
occurs above 275 words per minute implied some sort of threshold of
comprehension. Carver (1973a) furnished additional support for a
comprehension threshold by deriving a so-called duration measure based
on seconds per word instead of words per minute. Thresholds are well
known in intelligibility testing. Garvey (1953) concluded that
compressed speech was more intelligible than speeded speech by
determining the intelligibility of single words by means of the
articulation test. As described earlier, minimum loudness thresholds
for understanding connected speech, including speech-Bekesy methods,
also have been developed.


Because a threshold is implied in both intelligibility and
comprehension measurement, a question arose as to which of these
constructs is measured by the present threshold. The purpose of the
present work, therefore, was to determine whether the threshold was
related to the comprehension of speech or to speech intelligibility. To
accomplish this, two experiments were performed. The first compared
thresholds for two types of compressed speech previously reported to
differ in intelligibility: simple speeded speech produced by the
speed-changing method and compressed speech produced by the sampling
method. At the same time, the effect of four different magnitudes of
rate of change of speech speed (acceleration-deceleration) was studied.
The second experiment presented several speech passages compressed by
different amounts, and determined comprehension of the passages by
questionnaires. The relationship of the threshold determinations to
comprehension measures was studied.


Experiment 1


METHOD


Method


Participants. Young military enlisted personnel with aptitude
test scores of at least 110 (AFQT) and no known hearing defects
participated in the research. Thirty-two individuals, including 25 males
and 7 females, were assigned randomly to counterbalanced orders to be
described later.


Stimulus Materials. The stimulus materials were selected from
"Talking Books," tape recordings prepared at the Library of Congress,
Division for the Blind and Physically Handicapped. The selected
recordings consisted of passages from a book of historical portraits,
The Proud Tower by Barbara Tuchman (1966), read aloud by a female voice
at the average rate of 126 words per minute (wpm) and recorded at a
tape speed of 3.75 in./sec (9.525 cm/sec). Eight passages were used
for the threshold determinations.

Apparatus. A Crown (800 series) variable-speed tape recorder was
used to reproduce speech. This recorder, together with a speed control
device (Crown VSD-5), produced time-compressed speech by the
speed-changing method (speeded speech). An AmBiChron (ABC) speech
compressor (Koch, 1974) was used in conjunction with the above equipment
to produce time-compressed speech by the sampling method (compressed
speech). The AmBiChron compressor samples speech, writes the signal into
a temporary memory, and reads the signal from memory at a rate that may
be different from the writing rate. The rate of writing into memory is
directly proportional to the tape transport speed (which refers to
speed of the Crown tape recorder). The read-out rate is constant. In
effect, compression discards brief segments of the speech signal, while
expansion repeats brief segments. This device produced speech with
normal pitch despite changing speeds.

Both the speed of the tape recorder and the pitch compensation of
the AmBiChron were remotely controlled by a laboratory-fabricated
device. Details of the instrumentation are available elsewhere (de Haan
& Schjelderup). The device provided for an initial starting speed and
constant rates of acceleration and deceleration that were selected by
the adjustment of three potentiometers. Potentiometer settings had
previously been calibrated in units of time required for the rapidity
of speech to double.

The voltage applied to the tape recorder by the control circuits,
which determined both the speed of the tape recorder and the pitch
compensation of the AmBiChron, was displayed on a digital voltmeter and
recorded in permanent form on a 6-in.-wide (15.24 cm) strip chart
(Atomic Accessories, Model SR320) at 2 in./sec (5.08 cm/sec). Another
form of information about momentary speed of speech could be read from
a digital frequency counter. All speech tapes contained one channel
of a 1,000-Hz tone recorded at 3.75 in./sec (9.525 cm/sec). As the
speed of the tape varied, the frequency changed proportionately and
this information was displayed on the frequency counter.
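
The proportionality between tape speed and tone frequency can be
sketched as a small calculation. This is an illustration rather than a
description of the actual instrumentation: the function name
speech_rate_from_tone is hypothetical, and the conversion to words per
minute assumes the average reading rate of 126 wpm given for the
stimulus passages.

```python
REFERENCE_TONE_HZ = 1000.0   # tone frequency at the recording speed
NORMAL_RATE_WPM = 126.0      # average reading rate reported for the passages


def speech_rate_from_tone(measured_hz):
    """Return (speed ratio, words per minute) for a counter reading."""
    ratio = measured_hz / REFERENCE_TONE_HZ  # tape speed relative to normal
    return ratio, ratio * NORMAL_RATE_WPM


ratio, wpm = speech_rate_from_tone(2000.0)
print(ratio, wpm)  # 2.0 252.0
```

Under these assumptions a counter reading of 2,000 Hz corresponds to
twice the recording speed, or about 252 wpm.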

Additional equipment included a small control box which enabled
the individual participant to set the intensity and rate of speech (the
latter was inactivated during the experiment proper), a pair of
Grason-Stadler headphones (Telephonics, TDH-39), and a Grason-Stadler
audiometer switch used to select either acceleration or deceleration.
The rates of acceleration and deceleration were selected by
potentiometer settings at 2.1, 4.2, 8.4, and 16.8 wpm/sec for the four
experimental rates of change. These values represented rates at which
speech would change from the normal rate to double the normal rate
during acceleration (or vice versa during deceleration) in 60, 30, 15,
and 7.5 sec, respectively.
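
The four rates of change follow from the doubling times and the normal
speech rate. The sketch below assumes a linear change in rate and the
126-wpm average reading rate given for the passages; the function name
rate_of_change is introduced here only for illustration.

```python
NORMAL_RATE_WPM = 126.0  # average reading rate reported for the passages


def rate_of_change(doubling_time_sec, normal_rate=NORMAL_RATE_WPM):
    """Constant acceleration (wpm/sec) that doubles the rate in the given time."""
    # Doubling means adding one full normal rate over the interval.
    return normal_rate / doubling_time_sec


for t in (60, 30, 15, 7.5):
    print(t, round(rate_of_change(t), 1))
```

With these inputs the calculation gives 2.1, 4.2, 8.4, and 16.8 wpm/sec
for doubling times of 60, 30, 15, and 7.5 sec.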

Experimental Design. The first independent variable was the type
of time-compressed speech. There were two types, one of which was
presented at each trial. In the first type, rate and pitch were
interlocked and both were determined by the speed of the tape recorder.
This is speeded, or type S, speech. In the second type, pitch was held
constant while rate of speech was varied. This is compressed, or type C,
speech.


The second independent variable was rate of change of speech
speed, of which there were four levels. At each trial, speech was
presented at one of four constant rates of change, 2.1, 4.2, 8.4, or
16.8 wpm/sec. Whether it was accelerating or decelerating at any given
moment was dependent on the subject's response.

The two types of compressed speech were combined with the four
rates of change to yield eight experimental conditions under which
thresholds were determined. All subjects were exposed to the eight
conditions, but order of presentation of conditions differed. Speeded
and compressed speech were presented on alternate trials; half the
subjects received speeded speech first, and half compressed speech
first. The order of rate of change conditions was partially
counterbalanced with half the subjects receiving rate of change
conditions in increasing order, and the other half receiving rate of
change conditions in decreasing order. This yielded four orders of
presentation, to which subjects were randomly assigned, as follows:
Because of the partial counterbalancing shown above, the factor of
trials was not formally analyzed.
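
The four orders of presentation can be enumerated with a short sketch.
The exact trial-by-trial sequences are not reproduced in this text, so
the sketch assumes that both speech types are presented at one rate of
change before the next rate is introduced; that pairing, and the
function name presentation_order, are assumptions made for illustration.

```python
from itertools import product

RATES = (2.1, 4.2, 8.4, 16.8)  # rates of change in wpm/sec


def presentation_order(first_type, increasing):
    """Build one eight-trial order under the assumptions stated above."""
    second_type = "C" if first_type == "S" else "S"
    rates = RATES if increasing else tuple(reversed(RATES))
    trials = []
    for rate in rates:
        # Assumed: both speech types at one rate before the next rate.
        trials.append((first_type, rate))
        trials.append((second_type, rate))
    return trials


orders = [presentation_order(t, inc) for t, inc in product("SC", (True, False))]
for order in orders:
    print(order)
```

Each of the four lists contains all eight conditions exactly once, with
speech type alternating from trial to trial.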

Procedure. After receiving a brief explanation of the experiment,
each participant was taken into the experimental chamber, fitted with
headphones, and presented with a small control box with two knobs, one
controlling volume and the other rate of speech. The tape recorder was
turned on, and the participant was instructed to adjust the volume
control to his own comfortable listening level.

Then the participant was made familiar with speeded and compressed
speech. The experimenter demonstrated the function of the rate knob by
turning it through a range of approximately 0.5 to 3.0 times normal
speed. The participant was then instructed to adjust the knob to set
the speed at his or her own preference level. The same procedure was
repeated with compressed speech. After completion of this short
familiarization procedure, the control box was set to one side and the
rate control was inactivated.

The participants were then individually introduced to the threshold
task. A pushbutton switch (Grason-Stadler audiometer switch) was
demonstrated while the task was explained to them. They were told that
the speech would get faster automatically as long as the button was not
pressed and that it would get slower as long as the button was pressed.
They were instructed to press the button as soon as it became too fast
to understand and to release the button as soon as they could understand
it again, continuing this process until stopped by the experimenter. For
most participants, no more instruction was required, although a few
required a repetition and emphasis of some part of the instructions.
Each participant's threshold was determined eight times, one 1-min
threshold being determined under each of the eight conditions previously
described.
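
The tracking task can be illustrated with a minimal simulation. It is a
sketch under strong assumptions, not a model fitted to the data: the
simulated listener has a single sharp threshold and a fixed response
lag, and the starting rate, lag, and time step are arbitrary values
chosen for illustration.

```python
def simulate_tracking(threshold_wpm, rate_of_change, lag_sec=1.0,
                      start_wpm=126.0, duration_sec=60.0, dt=0.1):
    """Simulate one tracking record; return (peaks, troughs) in wpm.

    All parameters are illustrative assumptions, in particular the fixed
    response lag and the idea of a single sharp threshold.
    """
    lag_steps = int(round(lag_sec / dt))
    n_steps = int(round(duration_sec / dt))
    rate = start_wpm
    pressed = False        # button released: speech accelerates
    mismatch_since = None  # step at which perception and button diverged
    peaks, troughs = [], []
    for step in range(n_steps):
        too_fast = rate > threshold_wpm
        if too_fast != pressed:
            if mismatch_since is None:
                mismatch_since = step
            if step - mismatch_since >= lag_steps:
                # The listener responds: press at a peak, release at a trough.
                (peaks if too_fast else troughs).append(rate)
                pressed = too_fast
                mismatch_since = None
        else:
            mismatch_since = None
        rate += (-rate_of_change if pressed else rate_of_change) * dt
    return peaks, troughs


peaks, troughs = simulate_tracking(threshold_wpm=266.0, rate_of_change=8.4)
print(len(peaks), len(troughs))
```

In this sketch the record is a sawtooth that brackets the assumed
threshold, and the size of the swings grows with the rate of change
because the rate keeps moving during the response lag.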


Results

Raw data were recorded in analog form. The numerical values of
the upper and lower points of the sawtooth records were transcribed and
multiplied by 5.04 to transform voltage into words per minute. All
further analysis was done on these transformed data. The mean of each
record represented the absolute threshold, while the difference between
means of the upper and lower points represented the difference limen.
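
The scoring just described can be sketched as follows. The conversion
factor is read here as 5.04 wpm per unit of recorded voltage; the
function name threshold_and_limen and the voltage readings in the
example are hypothetical and serve only to show the arithmetic.

```python
from statistics import mean

VOLTS_TO_WPM = 5.04  # conversion factor given in the text (read as 5.04)


def threshold_and_limen(upper_volts, lower_volts):
    """Return (absolute threshold, difference limen) in words per minute."""
    upper = [v * VOLTS_TO_WPM for v in upper_volts]  # peaks of the sawtooth
    lower = [v * VOLTS_TO_WPM for v in lower_volts]  # troughs of the sawtooth
    absolute_threshold = mean(upper + lower)       # mean of the whole record
    difference_limen = mean(upper) - mean(lower)   # amplitude of the swings
    return absolute_threshold, difference_limen


# Hypothetical readings, not data from the study.
threshold, limen = threshold_and_limen([54.0, 55.0, 53.0], [50.0, 51.0, 49.0])
print(round(threshold, 2), round(limen, 2))
```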

Figure 1 presents the threshold data for both speeded and
compressed speech at each rate of acceleration-deceleration. The
difference between the thresholds for speeded and compressed speech is
clearly evident; the mean value for compressed speech was approximately
266 wpm, while that for speeded speech was approximately 218 wpm. An
analysis of variance revealed a significant effect of type of speech
compression, F(1, 28) = [illegible], p < .01. Although the curves in
Figure 1 show only a slight effect of rate of change, this effect was
statistically significant, F(3, 84) = 4.0[?], p < .01. The interaction
of rate with order of presentation, however, was also significant,
F(3, 84) = [illegible], p < .01. Comparison of individual means by
Tukey's HSD test revealed that, while means for the various rates in
the decreasing order group did not differ, the means for the two fastest
rates in the increasing order group were higher than that for the
slowest rate, q(2, 112) = [illegible] and [illegible], respectively,
p < .05. This observation indicates that the slight rise in the curves
in Figure 1 can be completely accounted for by the increasing order of
presentation.

The mean difference limens, which correspond to the amplitude of
the rate-swings as the subject tracks his threshold, are shown in
Figure 2 at each rate of acceleration-deceleration. The effect of
type of speech compression is again evident, F(1, 28) = 74.8[?],
p < .01. In contrast to the rather small effect of rate of change on the
absolute threshold, the analysis revealed a pronounced effect of rate of
change on the difference limen, F(3, 84) = 370.16, p < .01. Figure 2
indicates that the size of the difference limen is roughly proportional
to the rate of change.


A greater increase in the difference limen for compressed versus
speeded speech also is shown in Figure 2, and the analysis indicated an
interaction of type of speech compression with rate of change, F(3, 84)
= 43.92, p < .01. Further analysis by means of Tukey's HSD revealed
that, while the difference between the pair of means at the slowest rate
was not significant, differences between the pair of means at each of
the three higher rates were significant, q(2, 112) = 4.17, 9.87, and
19.00, respectively, p < .01. Increasing order of presentation also
yielded higher difference limens than did decreasing order of
presentation, F(1, 28) = 11.34, p < .01, and there was an interaction
between rate and order of presentation, F(3, 84) = 3.20, p < .05.


Experiment  2 


Method 


Participants. The same individuals who had participated in
Experiment 1 participated in Experiment 2.


Stimulus Materials. Stimulus materials came from the same source
as those in Experiment 1, but consisted of different passages, seven in
all, from The Proud Tower. Comprehension tests on the content of these
passages had been prepared and standardized for use in another study
(Lambert, Shields, Gade, & Dressel). These tests consisted of 10
multiple-choice questions on each of the passages. Thirty individuals
from the larger standardization group of Army communication trainees had
been exposed to all seven passages. These 30 individuals comprise the
standardized control group for the present study. All passages had been
presented to the control group at normal speed (136 wpm).

For the experimental group, a tape was prepared of the seven
passages compressed to the following rates, where x indicates "times the
normal speech rate": 1.50x (203 wpm), 1.75x (249 wpm), 2.00x (284 wpm),
2.25x (306 wpm), 2.50x (330 wpm), 2.75x (360 wpm), and 3.00x (408 wpm).

Apparatus. The same apparatus was used in Experiment 1 and
Experiment 2. The compressed speech tape was prepared with the AmBiChron
speech compressor, and reproduced by means of the Crown recorder at
3.75 in./sec (9.525 cm/sec).

Procedure. Participants in Experiment 1 were given a 5- or 10-min
rest period before being brought back to the laboratory to participate
in Experiment 2. At this time, the purpose of the comprehension
experiment was explained to them. They were told that seven passages
would be presented to them, each of which would be faster than the
preceding one, the final passages being extremely rapid. They were also
told that a multiple-choice test would be administered following each
passage in which they would be expected to answer all questions.
Passages were then presented one at a time, each followed by 10
multiple-choice questions. Test 1 followed the passage presented at
1.5x, Test 2 the passage at 1.75x, and so on, through Test 7 following
the passage at 3.0x.

Results

Table 1 shows the mean comprehension test scores for both
experimental and control groups for each of the seven passages, together
with the compression ratio at which each passage was presented. The
differences between the experimental and control groups are also shown.
These differences were found by subtracting control group means from
their respective experimental group means. Consequently, a positive
difference indicates that the experimental group was superior to the
control group, while a negative difference indicates that the
experimental group was inferior to the control group.

Figure 3 presents the group means of the experimental and control
groups, as well as the difference between the respective means without
regard to the sign of the difference. It may be observed that
comprehension of the experimental group decreased as the compression
ratio increased, but that mean comprehension scores on certain of the
tests in the control group were lower than those on others, especially
Test 6. In general, comprehension scores of the experimental group were
reduced relative to the control group as the compression ratio
increased.

The comprehension test scores were subjected to a repeated measures
analysis of variance. Experimental and control group means were
significantly different, F(1, 60) = 7.90, p < .01. Tests were also a
significant source of variation, F(6, 360) = 12.51, p < .01, as was the
interaction of tests with conditions, F(6, 360) = 4.70, p < .01.
Analysis of variance for the simple main effects of the experimental
variable yielded the following values of F for Tests 1 through 7,
respectively: 2.36, 2.87, 1.04, 5.37, 12.07, 7.19, and 11.92. The values
for Tests 5, 6, and 7 were significant at the .01 level, F(1, 60),
p < .01, while the value for Test 4 was significant at the .05 level,
F(1, 60), p < .05. This indicates that the experimental group exhibited
a significantly lower degree of comprehension than the control group
when the word rate reached approximately 306 wpm.

Of primary interest was the correlation of the comprehension test
scores with the thresholds determined in Experiment 1. Pearson
product-moment correlation coefficients were obtained between each of
the eight threshold conditions and the seven comprehension tests. The
two highest correlations, .55 and .48, respectively, were between each
of the types of speech at the slowest rate of change (2.1 wpm/sec) and
the first comprehension test (1.5x normal speed). In general, the other
correlations were quite low. This indicates that there is little
relationship between thresholds and comprehension test scores in the
present experiment.
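
For readers who wish to see the computation, a minimal sketch of the
Pearson product-moment coefficient follows. The function name pearson_r
and the example values are hypothetical; they are not data from the
study.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical thresholds (wpm) and comprehension scores (0 to 10).
thresholds = [250.0, 270.0, 260.0, 280.0, 240.0]
scores = [6, 7, 5, 8, 6]
print(round(pearson_r(thresholds, scores), 2))
```

In the study, one such coefficient would be computed for each pairing of
a threshold condition with a comprehension test.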


Reliabilities of the comprehension tests for the control group were
determined by Kuder-Richardson Formula 20 and yielded the following
reliability coefficients for Tests 1 through 7, respectively: .56, .55,
.66, .62, .76, .46, and .44. Undoubtedly the relationship between
thresholds and comprehension scores was somewhat attenuated by these
moderate reliabilities.
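
Kuder-Richardson Formula 20 can be sketched for a matrix of right/wrong
item scores as shown below. The function name kr20 and the small example
matrix are hypothetical, and the use of the population variance of total
scores is an assumption about the variant of the formula that was
applied.

```python
from statistics import pvariance


def kr20(item_matrix):
    """KR-20 reliability for rows of 0/1 item scores (one row per examinee)."""
    n_people = len(item_matrix)
    n_items = len(item_matrix[0])
    totals = [sum(row) for row in item_matrix]
    var_total = pvariance(totals)  # assumed: population variance of totals
    if var_total == 0:
        raise ValueError("total scores have zero variance")
    sum_pq = 0.0
    for j in range(n_items):
        p = sum(row[j] for row in item_matrix) / n_people  # proportion correct
        sum_pq += p * (1 - p)
    return (n_items / (n_items - 1)) * (1 - sum_pq / var_total)


# Hypothetical responses of four examinees to three items.
example = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
print(kr20(example))
```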


DISCUSSION 

This study was undertaken to determine the relationship of the
threshold to traditional psychological constructs such as speech
intelligibility and comprehension. The results of Experiment 1 support
the hypothesis that the threshold is a measure of intelligibility for
connected speech. Garvey (1953) compared the intelligibility of single
words compressed by the sampling method with those compressed by the
speed-changing method and found a higher percentage of intelligibility
for those compressed by the sampling method. This finding is in
agreement with the results of the present experiment on connected
speech, which found a higher threshold for compressed than for speeded
speech.

The  agreement  is,  in  fact,  rather  close.  In  the  present  experiment, 
the  thresholds  for  compressed  speech  and  speeded  speech  are  approximately 
2.1  and  1.7  times  the  normal  speech  rate,  respectively.  Assuming  that 
the  two  types  of  speech  are  equivalent  in  intelligibility  at  these  rates, 
a comparison  may  be  made  with  Garvey's  data  for  intelligibility  of  single 
words.  Garvey  found  that  approximately  95%  of  the  compressed  words  were 
intelligible  at  2.0  times  the  normal  speech  rate,  while  approximately 
90%  of  the  speeded  words  were  intelligible  at  1.7  times  the  normal  speech 
rate. 


The results of this experiment also indicated not only that compressed
speech was more intelligible than speeded speech, but also that changes
in the intelligibility of compressed speech were more difficult to detect.
This may be the result of cues for pitch interacting with intelligibility
in the case of speeded speech in such a way that the detection of change
is made easier.

The  results  of  Experiment  2 do  not  support  the  hypothesis  that  the 
threshold  is  a measure  of  comprehension.  Correlations  between  threshold 
values  and  traditional  comprehension  test  scores  were  generally  low.  In 
view  of  the  lack  of  relationship  between  the  thresholds  and  traditional 
comprehension  measures,  one  fact  still  demands  explanation:  The  mean 
value  of  the  compressed  speech  threshold,  approximately  265  wpm,  is  in 
close  agreement  with  the  point  at  which  Foulke  and  Sticht  (1969)  have 
claimed  that  comprehension  falls  off  rapidly,  namely  250  to  275  wpm. 
Although  the  work  of  Carver  (1973a)  also  supports  the  notion  of  a threshold 
in  comprehension  measurement,  his  duration  measure  (seconds  per  word) 
placed  the  threshold  at  the  equivalent  of  150  wpm,  which  does  little  to 
explain  the  similarity  of  the  present  value  to  that  of  Foulke  and  Sticht. 
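
The rate figures compared above can be put on a common footing with a few
conversions. The sketch below is illustrative only. The function names are
hypothetical, a linear increase in speech rate is assumed, and the base
rate of 126 wpm is an inference from the slowest rate of change (2.1
wpm/sec) together with a doubling time of 60 sec; it is not stated in this
section.

```python
def seconds_per_word(wpm):
    """Convert a speech rate in words per minute to a duration per word."""
    return 60.0 / wpm


def words_per_minute(sec_per_word):
    """Convert a duration per word in seconds to words per minute."""
    return 60.0 / sec_per_word


def seconds_to_double(base_wpm, change_wpm_per_sec):
    """Time for a linearly increasing rate to reach twice the base rate."""
    return base_wpm / change_wpm_per_sec


def rate_as_multiple(base_wpm, multiple):
    """Speech rate in wpm at a given multiple of the base rate."""
    return base_wpm * multiple


ASSUMED_BASE_WPM = 126.0  # inferred, not reported in this section

# A duration measure equivalent to 150 wpm, expressed in seconds per word.
print(seconds_per_word(150.0))

# Doubling time at the slowest rate of change, under the assumed base rate.
print(seconds_to_double(ASSUMED_BASE_WPM, 2.1))

# Compressed-speech threshold of about 2.1 times normal, expressed in wpm.
print(rate_as_multiple(ASSUMED_BASE_WPM, 2.1))
```

Under these assumptions, a threshold of 2.1 times the base rate comes out
close to the reported mean of approximately 265 wpm.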

Prior to the collection and analysis of the data, it was thought
that  it  might  be  possible  to  measure  comprehension  by  the  threshold 
method.  Nevertheless,  during  the  collection  of  the  data,  the  neutral 
term  "understand"  was  used  in  the  instructions  rather  than  references 
to  "comprehension"  or  "intelligibility,"  since  it  seemed  to  communicate 
the  task  to  the  participants.  In  retrospect,  it  should  be  said  that 
the  conditions  of  the  experiment,  particularly  rates  of  change  which 
involved  potential  doubling  of  the  speech  rate  in  7.5  to  60  sec,  opera- 
tionally defined  the  concept  and  perhaps  precluded  any  other  interpreta- 
tion, in  spite  of  differing  connotations  of  "understand"  which  individuals 
may  have  brought  to  the  laboratory. 

Deese  (1969)  has  suggested  that  understanding  is  a valid  psycho- 
logical construct.  Schwartz,  Sparkman,  and  Deese  (1970)  found  that 
subjective  judgments  of  the  comprehensibility  of  isolated  sentences 
could  be  validated  against  structural  complexity  of  the  sentences  or 
against readability indices. Moreover, the work by Carver (1973a, 1973b)
was  based  on  the  percentage  of  thoughts  in  a passage  which  a listener 
judged  that  he  understood.  Deese  and  his  students  believe  that  during 
rapid  reading,  intermittent  interpretation  takes  place  rather  than  the 
full  process  necessary  for  comprehension.  Although  a person  may  not  in- 
terpret everything  he  reads  or  hears,  he  has  a kind  of  monitoring  device 
which  informs  him  that  he  has  the  ability  to  interpret  it.  Deese  has 
called  this  inward  sign  the  "feeling  of  understanding"  which  signals 
comprehensibility rather than comprehension per se. From all of the
foregoing,  it  must  be  concluded  that  comprehensibility  is  an  alternative 
construct  that  should  be  considered. 

Which  of  these  three  alternatives  is  the  threshold  measuring?  There 
is  evidence  for  intelligibility  and  against  comprehension.  But  what 
about  comprehensibility?  The  notion  is  intuitively  appealing,  the  term 
"understand"  was  used  in  the  instructions  to  subjects,  and  comprehensi- 
bility would  appear  to  be  a construct  in  search  of  a measurement  method. 
Nevertheless,  without  independent  evidence  that  the  threshold  is  a measure 
of  comprehensibility,  it  would  seem  that  the  most  defensible  interpreta- 
tion of  the  threshold  at  present  is  that  it  is  a measure  of  intelligi- 
bility. Accordingly,  it  may  be  referred  to  as  the  "threshold  of 
intelligibility  of  rapid  connected  speech."  On  the  other  hand,  the  term 
"speech-perception  rapidity  threshold"  might  be  preferable,  since  it  is 
descriptive,  theoretically  neutral,  and  serves  to  distinguish  the  thresh- 
old from  those  based  on  loudness. 

Beyond  the  theoretical  considerations  concerning  the  psychological 
constructs  underlying  the  threshold  method,  research  should  be  directed 
toward the variables to which the method may be sensitive. It has already
been demonstrated that the method is sensitive to individual differences:
There is a wide range of variation in threshold values among subjects. If
the threshold method is a valid way of measuring the understanding of
rapid speech, the implication of this finding is that group presentation
of rapid auditory information is inappropriate. Because the threshold
varies from one person to another, provisions should be made to allow
individuals to listen at their own rates.

Research should also be directed toward investigating how various
characteristics  of  speech  materials  affect  the  threshold.  One  would  ex- 
pect that  the  threshold  might  well  vary  with  the  difficulty  of  speech 
material,  or  even  with  syntactic  and  semantic  variables  underlying  dif- 
ficulty. It  might  be  possible  to  use  the  threshold  to  determine  the 
listenability of auditory material just as readability of printed material
is determined.

The  threshold  method  may  also  be  sensitive  to  the  quality  of  speech 
produced  by  speech  compression  devices  and  speech  synthesizers.  Should 
this  prove  to  be  the  case,  one  would  expect  that  the  better  the  quality 
of  speech,  the  higher  would  be  the  threshold  values.  Thus,  the  method 
might  be  used  to  evaluate  and  compare  these  devices. 

It  is  assumed  that  the  threshold  basically  reflects  some  temporal 
limit  of  information  processing.  It  is  not  presently  known  whether  this 
limit is peculiar to the auditory modality or whether more central
processes are involved. Should the latter prove to be true, research on
compressed  speech  may  considerably  elevate  the  status  of  listening  as 
compared  with  reading  as  a way  of  gaining  information. 